



e0af79ad53a336b4c4b4f7e2a68eb609-Paper-Conference.pdf

Neural Information Processing Systems

Humans have a powerful and mysterious capacity to reason. Working through a set of mental steps enables us to make inferences we would not be capable of making directly even though we get no additional data from the world. Similarly, when large language models generate intermediate steps (a chain of thought) before answering a question, they often produce better answers than they would directly. We investigate why and how chain-of-thought reasoning is useful in language models, testing the hypothesis that reasoning is effective when training data consists of overlapping local clusters of variables that influence each other strongly. These training conditions enable the chaining of accurate local inferences to estimate relationships between variables that were not seen together in training.
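The chaining idea in the abstract above can be illustrated with a toy calculation. This is a minimal sketch, not the paper's experimental setup: it assumes a hypothetical chain of three binary variables A -> B -> C, where only the local pairs (A, B) and (B, C) were ever observed together, and the probability tables are made-up values chosen for illustration.

```python
# Hypothetical local conditionals for a chain A -> B -> C of binary variables.
# These numbers are illustrative assumptions, not values from the paper.
p_b1_given_a = {0: 0.1, 1: 0.9}  # P(B=1 | A=a), learned from (A, B) pairs
p_c1_given_b = {0: 0.2, 1: 0.8}  # P(C=1 | B=b), learned from (B, C) pairs


def chained_estimate(a):
    """Estimate P(C=1 | A=a) by marginalizing over the intermediate variable B.

    A and C are never observed together, so the estimate is built by chaining
    the two local conditionals: sum_b P(B=b | A=a) * P(C=1 | B=b).
    """
    p_b1 = p_b1_given_a[a]
    return (1.0 - p_b1) * p_c1_given_b[0] + p_b1 * p_c1_given_b[1]


if __name__ == "__main__":
    for a in (0, 1):
        print(f"P(C=1 | A={a}) ~ {chained_estimate(a):.2f}")
```

Under these assumed tables, the intermediate step through B is what links A to C; a model that never saw A and C together has no direct estimate to fall back on.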


Targeted Sequential Indirect Experiment Design

Neural Information Processing Systems

Scientific hypotheses typically concern specific aspects of complex, imperfectly understood or entirely unknown mechanisms, such as the effect of gene expression levels on phenotypes or how microbial communities influence environmental health. Such queries are inherently causal (rather than purely associational), but in many settings experiments cannot be conducted directly on the target variables of interest and are instead indirect: they perturb the target variable but do not remove potential confounding factors. If, additionally, the resulting experimental measurements are high-dimensional and the studied mechanisms nonlinear, the query of interest is generally not identified. We develop an adaptive strategy to design indirect experiments that optimally inform a targeted query about the ground truth mechanism in terms of sequentially narrowing the gap between an upper and lower bound on the query. While the general formulation consists of a bi-level optimization procedure, we derive an efficiently estimable analytical kernel-based estimator of the bounds for the causal effect, a query of key interest, and demonstrate the efficacy of our approach in confounded, multivariate, nonlinear synthetic settings.






e8a642ed6a9ad20fb159472950db3d65-Supplemental.pdf

Neural Information Processing Systems

Methods for handling missing data have been extensively studied in the past few decades. Those methods can be roughly classified into two categories: complete case analysis (CCA) based and imputation-based methods. CCA-based methods, such as listwise deletion [1] and pairwise deletion [31], focus on deleting data instances that contain missing entries and keeping those that are complete. Standard techniques of single imputation include mean/zero imputation, regression-based imputation [1], and non-parametric methods [15,54]. For the factorized prior p(Z|U) of the i-VAE component of GINA, we used a linear network with one auxiliary input (which is set to be the fully observed dimension, X1).
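As a rough illustration of the baseline techniques named above (listwise deletion and mean imputation), the following sketch operates on a small table in which missing entries are marked with `None`. The function names and the toy data are assumptions made for this example; they are not taken from the paper or from any particular library, and the sketch assumes every column has at least one observed value.

```python
def listwise_deletion(rows):
    """Keep only the rows that have no missing (None) entries."""
    return [row for row in rows if all(v is not None for v in row)]


def column_means(rows):
    """Mean of the observed values in each column.

    Assumes every column has at least one observed value.
    """
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return means


def mean_impute(rows):
    """Replace each missing entry with the mean of its column."""
    means = column_means(rows)
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in rows]


if __name__ == "__main__":
    # Toy data: two of the four rows have one missing entry each.
    data = [[1.0, 2.0], [None, 4.0], [3.0, None], [5.0, 6.0]]
    print(listwise_deletion(data))
    print(mean_impute(data))
```

Listwise deletion discards half of this toy table, while mean imputation keeps every row at the cost of filling gaps with values that ignore relationships between columns.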